A Category Based Approach for Recognition of Out-of-Vocabulary Words
Authors
Abstract
The research project underlying this report was funded by the German Federal Minister for Education, Science, Research and Technology under grant number 01 IV 102 H/0. Responsibility for the content of this work lies with the authors.
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary is very large. In this paper we present a new approach for integrating out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that approximates the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
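The abstract does not say how the per-category function is defined. As a rough illustration only, the sketch below assumes one simple estimator: a category's out-of-vocabulary emission probability is taken to be the share of its training tokens whose word occurs exactly once in the corpus. The function name, the category labels, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def oov_probability_by_category(tagged_corpus):
    """Approximate P(OOV | category) from (word, category) pairs.

    Assumption (not from the paper): words seen exactly once in training
    stand in for unseen words, so the estimate for a category is
    (#tokens of singleton words in the category) / (#tokens in the category).
    """
    word_counts = Counter(word for word, _ in tagged_corpus)
    tokens_per_category = Counter(cat for _, cat in tagged_corpus)
    singletons_per_category = Counter(
        cat for word, cat in tagged_corpus if word_counts[word] == 1
    )
    return {
        cat: singletons_per_category[cat] / total
        for cat, total in tokens_per_category.items()
    }


# Hypothetical toy corpus with made-up category labels.
corpus = [
    ("i", "PRON"), ("fly", "VERB"), ("to", "PREP"), ("hamburg", "CITY"),
    ("i", "PRON"), ("fly", "VERB"), ("to", "PREP"), ("munich", "CITY"),
    ("i", "PRON"), ("go", "VERB"), ("to", "PREP"), ("hamburg", "CITY"),
]
probs = oov_probability_by_category(corpus)
```

Under this assumption, open categories such as city names would tend to receive a higher out-of-vocabulary probability than closed categories such as pronouns, which is the kind of per-category information a language model could then use when deciding whether to hypothesize an unknown word.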
Similar resources
The Sociological Effects of Peer/ Teacher Technology-Enhanced Scaffolding through Process Approach on Young Male vs. Female EFL Learners’ Vocabulary Knowledge
Gender is considered a sociological construct and investigating the role of gender in foreign language learning contexts is highly important due to the effects of sociological factors in learning. Therefore, the present study set out to explore the sociological effects of peer and teacher scaffolding through the process approach in a technology-enhanced environment on the vocabulary learning of...
Agile Development of a Custom-Made Vocabulary Mobile Application: A Critical Qualitative Approach
There have been some observed studies and developed applications (apps), with a concentration on Mobile Assisted Language Learning (MALL), and no consideration of communicative needs of the learners; besides, these studies focused on either the theoretical aspects or the utilization of the available apps in the market (Burston & Athanasiou, 2020). Hence, Vocabulary Guru (VG), a custom-made mobi...
A category based approach for recognition of out-of-vocabulary words
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant amount of out-of-vocabulary words even when the vocabulary size is very large. In this paper we present a new approach for the integration of out-of-vocabulary words into statistical language models. We use ...
Recognition of Out-of-vocabulary Words and Their Semantic Category
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant amount of out-of-vocabulary (OOV) words even when the vocabulary size is very large. In this paper we present a new approach for the integration of OOV words into statistical language models. It is based on t...
Classifying Out-of-vocabulary Terms in a Domain-Specific Social Media Corpus
In this paper we consider the problem of out-of-vocabulary term classification in web forum text from the automotive domain. We develop a set of nine domain- and application-specific categories for out-of-vocabulary terms. We then propose a supervised approach to classify out-of-vocabulary terms according to these categories, drawing on features based on word embeddings, and linguistic knowledge ...